Efficient Top-k Query Processing Algorithms in Highly Distributed Environments

Authors

Qiming Fang

Guangwen Yang

Abstract

Efficient top-k query processing in highly distributed environments is a valuable but challenging research topic. This paper focuses on the problem over vertically partitioned data and aims to propose more efficient algorithms. The effort goes into limiting both the data transferred and the communication round trips among nodes, in order to reduce the communication cost of query processing. Two novel algorithms, BulkDBPA and 4RUT, are proposed. BulkDBPA is derived from the centralized algorithm BPA2, which requires very few data accesses. BulkDBPA borrows the idea of the best position from BPA2 and so shares its advantage of transferring little data. It further reduces the number of communication round trips by using bulk read and bulk transfer mechanisms. 4RUT is inspired by the algorithm TPUT, which requires only three communication round trips to obtain the exact top-k results. 4RUT improves the top-k lower bound estimate by introducing one additional communication round trip, which subsequently reduces the data transferred during query processing. Experimental results show that both BulkDBPA and 4RUT transfer much less data and achieve lower response times than the competitors, including the Simple Algorithm and TPUT, and that each has its own suitable application environments.
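
The three round trips that the abstract attributes to TPUT can be illustrated with a minimal sketch. It follows TPUT as it is commonly described in the literature, not the 4RUT or BulkDBPA algorithms of this paper, whose details the abstract does not give. The assumptions are: the aggregate is a sum, local scores are non-negative, there is at least one node, and each node is modelled as an in-memory dict from object id to local score. The names `tput_top_k`, `kth_largest`, and `partial_sums` are hypothetical.

```python
import heapq
from collections import defaultdict


def kth_largest(values, k):
    """Return the k-th largest value, or 0.0 if fewer than k values exist."""
    top = heapq.nlargest(k, values)
    return top[-1] if len(top) == k else 0.0


def partial_sums(seen):
    """Sum, per object, the local scores the coordinator has received so far."""
    sums = defaultdict(float)
    for reported in seen:
        for obj, score in reported.items():
            sums[obj] += score
    return sums


def tput_top_k(nodes, k):
    """Exact top-k by summed score; `nodes` is a list of {object: score} dicts."""
    m = len(nodes)

    # Round trip 1: every node reports its k best local items. The k-th
    # largest partial sum (tau1) is a lower bound on the true k-th score.
    seen = [dict(heapq.nlargest(k, n.items(), key=lambda kv: kv[1]))
            for n in nodes]
    tau1 = kth_largest(partial_sums(seen).values(), k)
    threshold = tau1 / m  # uniform per-node threshold

    # Round trip 2: every node reports all items scoring >= threshold.
    for reported, node in zip(seen, nodes):
        reported.update({o: s for o, s in node.items() if s >= threshold})
    sums = partial_sums(seen)
    tau2 = kth_largest(sums.values(), k)  # refined lower bound

    # An unreported local score is below the threshold, so the partial sum
    # plus one threshold per missing node is an upper bound on the total.
    candidates = [
        obj for obj, partial in sums.items()
        if partial + threshold * sum(obj not in r for r in seen) >= tau2
    ]

    # Round trip 3: fetch exact scores for the surviving candidates only.
    exact = {obj: sum(n.get(obj, 0) for n in nodes) for obj in candidates}
    return heapq.nlargest(k, exact.items(), key=lambda kv: kv[1])


nodes = [
    {"a": 10, "b": 8, "c": 1},
    {"a": 2, "b": 9, "c": 7},
    {"c": 6, "d": 5, "a": 4},
]
print(tput_top_k(nodes, 2))
```

In this sketch the quality of the lower bound decides how many items cross the network in the second round trip, which is the quantity the abstract says 4RUT targets with its extra round trip.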

Similar resources

Efficient top-k processing in large-scaled distributed environments

The rapid development of networking technologies has made it possible to construct a distributed database that involves a huge number of sites. Query processing in such a large-scaled system poses serious challenges beyond the scope of traditional distributed algorithms. In this paper, we propose a new algorithm BRANCA for performing top-k retrieval in these environments. Integrating two orthog...

SPARQL Query Optimization on Top of DHTs

We study the problem of SPARQL query optimization on top of distributed hash tables. Existing works on SPARQL query processing in such environments have never been implemented in a real system, or do not utilize any optimization techniques and thus exhibit poor performance. Our goal in this paper is to propose efficient and scalable algorithms for optimizing SPARQL basic graph pattern queries. ...

Ad-hoc Top-k Query Answering for Data Streams

A top-k query retrieves the k highest scoring tuples from a data set with respect to a scoring function defined on the attributes of a tuple. The efficient evaluation of top-k queries has been an active research topic and many different instantiations of the problem, in a variety of settings, have been studied. However, techniques developed for conventional, centralized or distributed databases...

Top-k aggregation queries in large-scale distributed systems

Distributed top-k query processing has become an essential functionality in a large number of emerging application classes such as Internet traffic monitoring and peer-to-peer Web search. This work addresses efficient algorithms for distributed top-k queries in wide-area networks where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers.

KLEE: A Framework for Distributed Top-k Query Algorithms

This paper addresses the efficient processing of top-k queries in wide-area distributed data repositories where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers and the computational costs include network latency, bandwidth consumption, and local peer work. We present KLEE, a novel algorithmic framework for distributed top-k queri...

Journal title:

Volume 9, issue

Pages -

Publication date: 2014
